# Packages used below: dplyr (wrangling), ggplot2 (plots), patchwork (layout)
library(dplyr)
library(ggplot2)
library(patchwork)

# Panel A: arrests by age group
age_count <- data |>
  group_by(AGE_GROUP) |>
  summarize(Count = n())

plot1 <- ggplot(age_count, aes(x = AGE_GROUP, y = Count)) +
  geom_bar(stat = "identity", fill = "lightblue") +
  ggtitle("Number of Arrests by Age group") +
  xlab("Perp Age Group") +
  ylab("Number of Arrests")

# Panel B: gender split as a pie chart with percentage labels
gender_counts <- data |>
  group_by(PERP_SEX) |>
  summarize(Count = n()) |>
  mutate(
    Percentage = Count / sum(Count) * 100,
    Label = paste0(PERP_SEX, " ", round(Percentage, 1), "%"),
    Position = cumsum(Count) - Count / 2
  )

plot2 <- ggplot(gender_counts, aes(x = 2, y = Count, fill = PERP_SEX)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  labs(title = "Perp Gender") +
  theme_void() +
  scale_fill_manual(values = c("steelblue", "lightblue")) +
  geom_text(aes(x = 3, y = Position, label = Label), color = "black", size = 5) +
  theme(legend.position = "none") +
  xlim(0.5, 3)

# Panel C: arrests by race, ordered from most to least frequent
race_count <- data |>
  group_by(PERP_RACE) |>
  summarize(Count = n()) |>
  arrange(desc(Count))

plot3 <- ggplot(race_count, aes(x = reorder(PERP_RACE, -Count), y = Count)) +
  geom_bar(stat = "identity", fill = "lightblue") +
  ggtitle("Number of Arrests by PERP_RACE") +
  xlab("PERP_RACE") +
  ylab("Number of Arrests") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Panel D: the 15 most frequent offense descriptions, as horizontal bars
top_15_ofns_desc <- data |>
  group_by(OFNS_DESC) |>
  summarize(Count = n()) |>
  arrange(desc(Count)) |>
  head(15)

plot4 <- ggplot(top_15_ofns_desc, aes(x = reorder(OFNS_DESC, Count), y = Count)) +
  geom_bar(stat = "identity", fill = "lightblue") +
  labs(title = "Top 15 Arrest Categories",
       x = "Offense Description", y = "Number of Arrests") +
  coord_flip()

# Layout: A and B on the top row, C across the bottom left, D down the right column
layout <- "\nABD\nCCD"

(plot1 + plot2 + plot3) + plot4 +
  plot_layout(design = layout, widths = c(1, 1))
This composite figure summarizes the demographics of arrestees and the most common offense categories.
The top-left plot displays the distribution of arrests by age group, revealing that individuals aged 25-44 represent the highest proportion of arrests, followed by the 18-24 and 45-64 age groups.
The pie chart at the top center illustrates the gender distribution of arrests, with males accounting for a dominant 82.1% of all arrests, compared to 17.9% for females.
The bottom-left plot focuses on the perpetrator’s race description, showing that Black individuals are most frequently arrested, followed by White Hispanics, Black Hispanics, and other racial categories.
Finally, the bar chart on the right lists the top 15 offense categories, where Assault 3 & Related Offenses leads, followed by Petit Larceny and Felony Assault.
For the demographic section, it is important to note that the racial categories “Unknown” and “American Indian/Alaskan Native” represent a very small proportion of the overall arrests. Therefore, these groups are excluded from the analysis to focus on the more predominant racial categories.
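The exclusion described above can be sketched as a simple filter. This is a minimal illustration on a toy data frame, not the report's actual code: the label spellings ("UNKNOWN", "AMERICAN INDIAN/ALASKAN NATIVE") and the helper name `small_groups` are assumptions and should be checked against the real values of `PERP_RACE`.

```r
library(dplyr)

# Toy stand-in for the arrest data. The real `data` frame is assumed to
# carry a PERP_RACE column; the label spellings here are assumptions.
toy <- data.frame(
  PERP_RACE = c("BLACK", "WHITE HISPANIC", "UNKNOWN", "BLACK",
                "AMERICAN INDIAN/ALASKAN NATIVE", "WHITE")
)

# Hypothetical list of the small categories to drop.
small_groups <- c("UNKNOWN", "AMERICAN INDIAN/ALASKAN NATIVE")

# Share of arrests per category, to confirm the groups really are small
# before removing them.
race_share <- toy |>
  count(PERP_RACE, name = "Count") |>
  mutate(Share = Count / sum(Count))

# Keep only rows whose race category is not in the excluded list.
filtered <- toy |>
  filter(!PERP_RACE %in% small_groups)
```

Inspecting `race_share` first keeps the decision to exclude a category tied to its observed share rather than to a fixed list.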
The demographic distribution chart gives a foundational view of arrest patterns, but it treats each factor in isolation. To examine how these factors combine, the alluvial flow chart offers a more connected perspective: it shows how demographic categories such as gender and race flow into specific age groups, revealing relationships that the separate bar charts cannot.
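One way such an alluvial chart could be built is with the ggalluvial package. The sketch below is an assumption about the approach, not the report's own code: it uses a small toy data frame and the column names from the earlier plots, and the axis labels are illustrative.

```r
library(dplyr)
library(ggplot2)
library(ggalluvial)

# Toy stand-in for the arrest data (values are made up for illustration).
toy <- data.frame(
  PERP_SEX  = c("M", "M", "F", "M", "F", "M", "M"),
  PERP_RACE = c("BLACK", "WHITE HISPANIC", "BLACK", "BLACK",
                "WHITE", "WHITE", "BLACK"),
  AGE_GROUP = c("25-44", "25-44", "18-24", "45-64",
                "25-44", "18-24", "25-44")
)

# One row per gender/race/age combination, with its arrest count.
flows <- toy |>
  count(PERP_SEX, PERP_RACE, AGE_GROUP, name = "Count")

# Each axis is one demographic dimension; ribbon height encodes Count.
alluvial_plot <- ggplot(
  flows,
  aes(axis1 = PERP_SEX, axis2 = PERP_RACE, axis3 = AGE_GROUP, y = Count)
) +
  geom_alluvium(aes(fill = PERP_SEX), width = 1/4, alpha = 0.7) +
  geom_stratum(width = 1/4) +
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), size = 3) +
  scale_x_discrete(limits = c("Gender", "Race", "Age group"),
                   expand = c(0.1, 0.1)) +
  labs(y = "Number of arrests")
```

Aggregating to counts before plotting keeps the number of ribbons equal to the number of distinct combinations rather than the number of arrests.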
The alluvial diagram shows a clear overrepresentation of Black and White Hispanic males in the arrest data, especially in the 25-44 age group. This trend, combined with the far higher number of male arrests across all races, suggests that certain demographics are disproportionately affected. In contrast, the flows from groups such as Asian/Pacific Islanders are small, underscoring how arrests are concentrated among specific racial and gender groups. These patterns raise important questions about the systemic issues or societal factors that may be driving the disparities.
One significant influence on these disparities may be where individuals reside, as boroughs differ in living standards, access to resources, and community dynamics. To explore this, the following section examines the distribution of arrests across boroughs and how geography might play a role in these demographic trends.
3.1.3 Mosaic Plot: Borough, Race, and Offense Level
The mosaic plot reveals that offense levels follow a consistent pattern across all boroughs: approximately half of the arrests are for felonies, nearly half are for misdemeanors, and violations make up only a small proportion. The racial composition of arrests, by contrast, differs strikingly between boroughs, which points to an association between borough and race. For instance, Black individuals dominate arrests in Brooklyn, with proportions notably higher than in other boroughs, indicating a geographic concentration. White individuals are arrested more frequently in Staten Island than elsewhere, highlighting localized variation in racial arrest patterns. Similarly, White Hispanic and Black Hispanic individuals show different distributions, with greater White Hispanic representation in Brooklyn and Queens and greater Black Hispanic representation in the Bronx and Manhattan. These variations suggest that the borough of residence plays a significant role in shaping arrest patterns across racial groups.
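A claim of association between borough and race can be checked with a chi-squared test of independence on the borough-by-race table. The sketch below uses base R and randomly generated toy data; the column names `ARREST_BORO` and `LAW_CAT_CD` are assumptions about the dataset, and because the toy columns are sampled independently, no association is expected here.

```r
# Toy data: each column is sampled independently, so this only
# demonstrates the mechanics. Column names are assumptions.
set.seed(1)
n <- 500
toy <- data.frame(
  ARREST_BORO = sample(c("Bronx", "Brooklyn", "Manhattan",
                         "Queens", "Staten Island"), n, replace = TRUE),
  PERP_RACE   = sample(c("BLACK", "WHITE", "WHITE HISPANIC",
                         "BLACK HISPANIC"), n, replace = TRUE),
  LAW_CAT_CD  = sample(c("Felony", "Misdemeanor", "Violation"),
                       n, replace = TRUE, prob = c(0.48, 0.47, 0.05))
)

# Three-way mosaic: tile area is proportional to the cell count.
mosaicplot(~ ARREST_BORO + PERP_RACE + LAW_CAT_CD, data = toy,
           color = TRUE, las = 2,
           main = "Borough, Race, and Offense Level (toy data)")

# Two-way table of borough by race, then a test of independence.
boro_race  <- table(toy$ARREST_BORO, toy$PERP_RACE)
chi_result <- chisq.test(boro_race)

chi_result$parameter  # degrees of freedom: (5 - 1) * (4 - 1)
chi_result$p.value    # a small p-value would indicate an association
```

On the real data, the standardized residuals in `chi_result$stdres` would show which borough and race combinations depart most from independence.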
The next section examines the overall geographical patterns.